MOSAIC: Agglomerative Clustering with Gabriel Graphs

Authors

  • Rachsuda Jiamthapthaksin
  • Jiyeon Choo
  • Chun-sheng Chen
  • Oner Ulvi Celepcikay
  • Christian Giusti
  • Christoph F. Eick
Abstract

Representative-based clustering algorithms are popular because of their relatively high speed and their sound theoretical foundation. However, the clusters they can obtain are limited to convex shapes, and their results are highly sensitive to initialization. In this paper, a novel agglomerative clustering algorithm called MOSAIC is proposed, which greedily merges neighboring clusters so as to maximize a given fitness function. MOSAIC uses Gabriel graphs to determine which clusters are neighboring, and it approximates non-convex shapes as unions of small clusters that have been computed using a representative-based clustering algorithm. We evaluate MOSAIC for traditional unsupervised clustering with k-means and DBSCAN, and also for supervised clustering. The experimental results show that this technique leads to clusters of higher quality than running a representative-based clustering algorithm on its own. Given a suitable fitness function, MOSAIC is able to detect clusters of arbitrary shape that are comparable to the ones generated by DBSCAN. In addition, MOSAIC is capable of dealing with high-dimensional data. We also claim that MOSAIC can be employed as an effective post-processing clustering algorithm to further improve the quality of a clustering.
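
The two ingredients named in the abstract, a Gabriel graph over cluster representatives and a greedy merge of neighboring clusters, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the penalized squared-error fitness, and the rule of stopping as soon as no merge improves the fitness are all assumptions made for illustration, and the paper's own fitness functions and stopping rule may differ.

```python
import numpy as np


def gabriel_edges(points):
    """Return Gabriel-graph edges (i, j), i < j, over the rows of `points`."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Squared Euclidean distances between all pairs of points.
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # i and j are Gabriel neighbors if no third point lies strictly
            # inside the ball whose diameter is the segment from i to j.
            if all(d2[i, k] + d2[j, k] >= d2[i, j]
                   for k in range(n) if k != i and k != j):
                edges.add((i, j))
    return edges


def make_fitness(x, penalty):
    """Hypothetical fitness: negative within-cluster squared error minus a
    fixed penalty per cluster (higher is better). Not from the paper."""
    x = np.asarray(x, dtype=float)

    def fitness(labels):
        ids = np.unique(labels)
        sse = sum(((x[labels == c] - x[labels == c].mean(axis=0)) ** 2).sum()
                  for c in ids)
        return -sse - penalty * len(ids)

    return fitness


def greedy_merge(labels, reps, fitness):
    """Greedily merge Gabriel-neighboring clusters while fitness improves.

    labels:  initial cluster id per data point; ids index the rows of `reps`.
    reps:    one representative (for example a centroid) per initial cluster.
    fitness: callable scoring a label array; higher is better.
    """
    labels = np.asarray(labels).copy()
    edges = gabriel_edges(reps)  # which initial clusters are neighbors
    best = fitness(labels)
    while edges:
        # Score every merge of two neighboring clusters.
        trials = [(fitness(np.where(labels == b, a, labels)), a, b)
                  for a, b in edges]
        score, a, b = max(trials, key=lambda t: t[0])
        if score <= best:
            break  # assumption: stop when no single merge helps
        labels, best = np.where(labels == b, a, labels), score
        # Cluster b is now part of a: redirect its edges and drop self-loops.
        renamed = {tuple(sorted((a if u == b else u, a if v == b else v)))
                   for u, v in edges}
        edges = {(u, v) for u, v in renamed if u != v}
    return labels, best
```

In practice the initial `labels` and `reps` would come from a representative-based algorithm such as k-means run with a deliberately large number of clusters, so that non-convex shapes can be assembled from many small convex pieces. The brute-force Gabriel test above takes cubic time in the number of representatives, which is acceptable only because that number is small compared with the number of data points.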

Similar resources

MOSAIC: A Proximity Graph Approach for Agglomerative Clustering

Representative-based clustering algorithms are quite popular due to their relatively high speed and because of their sound theoretical foundation. On the other hand, the clusters they can obtain are limited to convex shapes and clustering results are also highly sensitive to initializations. In this paper, a novel agglomerative clustering algorithm called MOSAIC is proposed which greedily merges ...

2 Review of Agglomerative Hierarchical Clustering Algorithms

Hierarchical methods are a well-known clustering technique that can be potentially very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglo...

Multilevel Refinement for Hierarchical Clustering

Hierarchical methods are a well-known clustering technique that can be potentially very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglo...

A Survey on Efficient Clustering Methods with Effective Pruning Techniques for Probabilistic Graphs

This paper provides a survey on K-NN queries, DCR query, agglomerative complete linkage clustering and Extension of edit-distance-based definition graph algorithm and solving decision problems under uncertainty. This existing system give an beginning to Graph agglomeration aims to divide information into clusters per their similarities, and variety of algorithms are planned for agglomeration gr...

Publication date: 2007